(54) Method and apparatus for load sharing on a wide area network



(57) Clients (106-1 through 106-N, 107-1 through 107-M) on
local area networks (102, 103) making requests to hot
sites, which are connected on a wide area network
(100) such as the Internet, are redirected through one of
a possible plurality of different redirectors (101, 103) to
one of a possible plurality of caching servers (S1, S2,
S3), which each have responsibility for mapping one or
more of the hot sites. Each request is probabilistically 
directed by one of the redirectors to one of the caching 
servers that map the requested hot site in accordance 
with weights that are determined for that redirector-hot 
site pair so as to minimize the average delay that all cli- 
ent requests across the network will encounter in mak- 
ing requests to all the cached hot sites. In order to 
determine the weights with which each redirector will 
redirect requests to the hot sites to the caching servers, 
statistics of access rates to each hot site are dynami- 
cally determined by each redirector in the network from 
the traffic flow and reported to a central management 
station (CMS) (115). Network delay is similarly meas- 
ured by each redirector and reported to the CMS, and 
server delay is computed using a queuing model of 
each server. Using these parameters as inputs, a non- 
linear programming optimization problem is solved as a 
network flow problem in order to determine the weights 
for each redirector that will minimize the average delay. 
As the access rate statistics, as well as the network 
delay and server delay, dynamically change, the CMS, 
using the network flow algorithm, recalculates the 
weights and forwards them back to each redirector. In 
other embodiments, the redirector-logical item pair for 
which the redirector probabilistically directs client 
requests may be other than a hot site identity. For
example, the logical items can be groups of clients or groups
of documents, and the servers to which requests are 
forwarded can be web servers or caching servers. 
Description
Technical Field 

[0001] This invention relates to wide area networks such as the Internet, and more particularly, to a method and 
apparatus for minimizing the average delay per unit of time of all client requests incurred over all connections estab- 
lished for accessing server devices, such as web servers and proxy cache servers, which are located across the wide 
area network. 

Background of the Invention 

[0002] In co-pending patent application Serial No. 08/953577 entitled "Data Distribution Techniques for Load-Bal- 
anced Fault-Tolerant Web Access", by B. Narendran, S. Rangarajan (co-inventor herein) and S. Yajnik, assigned to the 
assignee of the present application, and which is incorporated herein by reference, a methodology is described for bal- 
ancing the load on a set of devices connected on a wide area network such as the Internet. Specifically, the methodol- 
ogy for load balancing described in that application provides, in a first phase, an algorithm for distributing web 
documents, or objects, onto different servers such that the total access rates to each server (equal to the total number 
of connection requests that a server handles per time unit) are balanced across all the servers. Further, in a second 
phase of the methodology, a network flow-based algorithm is used to re-balance the access rates to each server in 
response to system changes without moving objects between the different servers. 

[0003] In further detail, in the first phase of the load balancing methodology, logical items, such as the web
documents, or objects, are mapped to different physical devices such as web servers, cache servers, ftp servers, etc., based
on the a priori access rates that are known for requests from/to these web documents. This mapping, referred to as the
initial distribution, takes as an input the access rates of each web document, the number of replicas of these web
documents that need to be made on the physical devices, such as document servers or caches, and the capacity of each
of the physical devices, and produces a mapping between the web documents and the physical devices. It also pro- 
duces as an output the probabilities (or weights) that will then be used by a redirection server to redirect requests 
from/to replicated web documents to one of the physical devices to which they are mapped. This initial distribution
mapping is performed such that the load is balanced among the physical devices; that is, the sum of access rates of
requests to the web documents redirected to each physical device is balanced across all the devices. Load balance is 
achieved across the physical devices irrespective of the web documents that they handle. 

[0004] In the second phase of the methodology, once the initial distribution of the web documents is performed, any 
change in the system parameters that affects the load balance is handled using a network flow load balance algorithm 
to determine new probabilities (or weights) with which the redirection server will thereafter redirect requests
from/to web documents to one of the physical devices to which they are mapped. Thus, instead of re-mapping web
documents to different document servers or caches to handle a perturbation in load, the load is re-balanced by changing
the probability with which requests to each replicated web document are redirected to one of the plurality of physical
devices to which that document is mapped. Examples of parameters that may change in the system include the load
on each physical device and the capacity of each of the physical devices, the latter of which can instantly become zero 
upon the failure of a device. 

[0005] The goal of load balancing, as described, is to balance across all physical devices the sum of the access 
rates of requests to the web documents redirected to each physical device. The latency, or delay, incurred in providing 
a response from a physical device to a request for a web document made by a client has not been previously consid- 
ered. 

Summary of the Invention 

[0006] In accordance with the present invention, minimization of request latency on a wide area network such as 
the Internet is the goal rather than pure load balancing. The load sharing methodology of the present invention
minimizes delay by determining the probabilities, or weights, with which web requests are redirected over the wide area
network to the web servers so as to minimize the average delay of all connections across all servers per unit of time. Such
redirection to the different servers is effected as a function of a logical item, the logical item being the factor that the 
redirector uses in determining where and with what weights the request is to be directed. In determining a solution to a 
non-linear programming optimization problem, the network delay associated with accessing a server and the server 
delay, which itself is a function of the access rate on that server, are taken into account. After an initial distribution of 
logical items is completed, such as for load balancing purposes in accordance with the aforedescribed prior art meth- 
odology, or another method, the following are determined: (1) the access rates, which are equal to the number of 
requests per unit time associated with each redirector-logical item pair; (2) the network delay, which is equal to the sum 
of the propagation and the transmission delays between the client and the server; and (3) the server delays incurred in
processing a web request. Once these parameters are measured or mathematically computed, they are used to
determine the solution of a non-linear programming optimization problem. This non-linear programming problem is formulated
and solved as a minimum cost network flow problem to ultimately determine the probability distributions, or weights,
with which each redirector in the network will then redirect requests to the different servers which can satisfy them.
[0007] In a specific embodiment of the present invention, highly popular sites, also known as hot sites, are mapped to particular
caching servers dispersed in a wide area network, with each hot site being mapped to one or more caching servers. 
Statistics of access rates to each hot site are dynamically determined by each redirector in the network from the traffic 
flow and reported to a central management station (CMS). Network delay is similarly measured by each redirector and
reported to the CMS, and server delay at each server is computed using a queuing model of each server. Using these
parameters, the CMS solves a network flow problem to determine the weights with which each redirector will then
probabilistically forward requests for a hot site to the different plural servers which are responsible for that requested site. As
the access rate statistics, as well as possibly the measured network delay and server delay, dynamically change, the
CMS, using the network flow algorithm, recalculates the weights and forwards the adjusted weights back to each
redirector. The weights with which each redirector forwards requests for specific documents to a particular server are
therefore continually modified in a manner that minimizes the average delay based on the most recent access rate, network
delay and server delay statistics. 

Brief Description of the Drawing 

[0008]

FIG. 1 is a block diagram of a system on a wide area network showing two redirectors which direct requests from
groups of clients for hot sites to plural caching servers which are responsible for such hot sites in accordance with
a mapped relationship;

FIG. 2 is a prior art network flow solution for load balancing for a system containing a single redirector with plural 
caching servers; 

FIG. 3 is a network flow solution for load balancing for a system containing two redirectors; 
FIG. 4 is a graph showing the probability of a server dropping a connection request; 
FIG. 5 is a network flow solution of a load sharing problem for the system in FIG. 1; and
FIG. 6 is a flowchart detailing the steps of the present invention. 

Detailed Description 

[0009] The present invention is illustrated below in conjunction with exemplary client/server connections
established over the Internet using the Transmission Control Protocol/Internet Protocol (TCP/IP) standard. It should be
understood, however, that the invention is not limited to use with any particular type of network or network communica- 
tion protocol. The disclosed techniques are suitable for use with a wide variety of other networks and protocols. The 
term "web" as used herein is intended to include the World Wide Web, other portions of the Internet, or other types of
communication networks. The term "client request" refers to any communication from a client which includes a request
for information from a server. A given request may include multiple packets or only a single packet, depending on the 
nature of the request. The term "document" as used herein is intended to include web pages, portions of web pages, 
computer files, or other type of data including audio, video and image data. 

[0010] FIG. 1 shows an exemplary web server system 100 in accordance with an illustrative embodiment of the 
invention. The system includes a first redirection server R1 (101) local to local area network 102 and a second
redirection server R2 (103) local to local area network 104. Local area networks 102 and 104 are separated and connected
over a wide area network (Internet 105). A plurality of clients, 106-1 through 106-N, are connected to local area network 102
and a plurality of clients, 107-1 through 107-M, are connected to local area network 104. All requests for web documents
made by clients 106-1 through 106-N are passed through their local redirection server 101 and redirected to a caching server
on system 100. Similarly, all requests for web documents made by clients 107-1 through 107-M are passed through their
local redirection server 103 and redirected to a caching server on system 100. In the illustrative embodiment, a first
caching server S1 (110) is connected local to local area network 102, a second caching server S2 (111) is connected
to the wide area network Internet 105, and a third caching server S3 (112) is connected to local area network 104. In
addition to caching individual web documents, each of these caching servers is responsible for particular hot sites, or
web sites to which a large number of client requests are directed. Each caching server is responsible for one or more
of such hot sites and each such hot site may be associated with more than one of the caching servers. 
[0011] In the system 100, communication between clients and caching servers is effected over TCP/IP connections
established over a network in a conventional manner. Each of the elements of system 100 may include a processor and
a memory. The system 100 is suitable for implementing Hypertext Transfer Protocol (HTTP)-based network services
on the Internet in a manner well known to one skilled in the art. For example, a client may generate an HTTP request 
for a particular web document on a hot site by designating a uniform resource locator (URL) of that document at the 
domain name of the hot site. Such a request is passed through the requesting client's local redirection server and
redirected to one of the caching servers, S1, S2 or S3, which is responsible for that hot site. A TCP/IP connection is then
established between the requesting client and the particular caching server responsible for that hot site as selected by 
the local redirection server. If the particular requested document is available at that caching server, it is supplied to the 
requesting client. If not, a separate TCP/IP connection is established by the caching server to the actual hot site from 
where a copy of the requested web document is obtained and forwarded to the client. 

[0012] In the aforenoted co-pending patent application, a load distribution algorithm is presented for determining an
initial distribution of a set of documents across servers and a determination of the probabilities, or weights, with which
a redirector server should redirect requests to those particular servers that contain a replica of a requested document.
Such load distribution is undertaken to balance the load on the set of servers, where the definition of load balance is that
the sum of access rates of requests to the documents redirected to each of the servers containing the documents is
balanced across all the servers. Using as input the access rates of each document, the number of replicas of these
documents that need to be made on the servers, and the capacity of each server, an initial distribution algorithm, described
in the aforenoted co-pending application incorporated herein, produces a mapping between the documents and the
servers as well as producing as output the probabilities (or weights) to be used by a redirecting server to redirect
requests from/to replicated documents to one of the servers to which they are mapped to achieve the desired load
balance.

[0013] FIG. 2 shows a flow network model in the co-pending patent application between a source and a sink for five
documents, numbered 1 through 5, distributed on three servers, S1, S2 and S3, which each have equal scaled capacities of
0.333. The flow network is of a type described in, for example, R. K. Ahuja et al., "Network Flows: Theory, Algorithms and
Applications", Prentice Hall, 1993, which is incorporated by reference herein. The scaled access rates to each of the
documents, totaling 1, are shown on the five arcs between the source node and the redirector/document nodes. These
access rates represent the probabilistic proportion of requests arriving at the redirector for each of the five documents.
As can be noted from the arcs between the redirector/document nodes and the three server nodes, server S1 has
stored replicas of documents 1, 2, 3 and 4, server S2 has stored replicas of documents 1, 2, 4 and 5, and server S3 has
stored replicas of documents 2, 3 and 5. These documents have been distributed in accordance with the initial
distribution algorithm described in the aforenoted co-pending application to assure that at least two replicas of each document
are stored in the set of three servers. The numbers on the arcs between the redirector/document nodes and the server
nodes represent the network flow solution for dividing the incoming access rate to the redirector/document nodes for
each document to the servers that contain the document so that the total load to the servers S1, S2 and S3 is balanced.
In FIG. 2, the redundant arcs from the servers to the sink represent arcs that have infinite capacity but have a "high"
cost associated with them. The cost on all the other arcs is equal to zero. The "high" cost arcs are used for overflows
that may result when a change occurs in the system.

[0014] The distribution represented in FIG. 2 is a maximum-flow minimum-cost solution to the corresponding
network flow problem that models the load balancing problem which must be solved to determine the desired solution. The
solution is obtained using the mathematical language AMPL as described by R. Fourer et al., "AMPL: A Modeling
Language for Mathematical Programming," The Scientific Press, 1993, which is incorporated by reference herein, in
conjunction with a non-linear program solver such as MINOS. The numbers on the arcs between the redirector/document
nodes and the server nodes represent the portions of the access rates for each document that are redirected to each
server that contains the document. Thus, for example, 0.175 of the 0.35 access rate for document 1 is
redirected to server S1, and a similar portion is redirected to server S2. Thus, requests for document 1 received by the
redirector should be redirected to servers S1 and S2 with equal weights or probabilities. Similarly, 0.113 of the 0.5 access
rate for document 2 should be redirected to server S1, 0.108 of the 0.5 access rate should be redirected to server S2,
and 0.279 of the 0.5 access rate should be redirected to server S3. The corresponding weights, or probabilities, for
redirection of a request for document 2 are thus 0.226, 0.216 and 0.558 for servers S1, S2 and S3, respectively.
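
The conversion from arc flows to redirection weights described above can be sketched as follows. This is a minimal illustration, not the implementation of the co-pending application: the function name `redirection_weights` and the dictionary layout are assumptions introduced here, and the flow values are the ones quoted for document 2.

```python
def redirection_weights(flows):
    """Normalize per-server flows for one document into redirection probabilities."""
    total = sum(flows.values())
    if total <= 0:
        raise ValueError("total flow must be positive")
    return {server: flow / total for server, flow in flows.items()}


# Flows for document 2 quoted in the text (scaled access rate 0.5).
doc2_flows = {"S1": 0.113, "S2": 0.108, "S3": 0.279}
doc2_weights = redirection_weights(doc2_flows)
for server, weight in doc2_weights.items():
    print(f"{server}: {weight:.3f}")
```

Dividing each flow by the document's total access rate is all that is needed, because the flows leaving a redirector/document node sum to that node's incoming access rate.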
[0015] Although the above example and others in the co-pending patent application illustrate load balancing initial
distributions involving a single redirector server which redirects requests to a plurality of document servers on which
documents are replicated, the single redirector model can be extended to accommodate multiple redirectors. FIG. 3
models the same web server system as in FIG. 2, but includes two redirectors, as in the system of FIG. 1. In FIG. 3,
redirector/document nodes 1 through 5 now represent requests for documents 1 through 5, respectively, which are
redirected to the servers S1, S2 and S3 by redirector 1, and redirector/document nodes 1' through 5' represent requests for
documents 1 through 5, respectively, which are redirected to the servers S1, S2 and S3 by redirector 2. The scaled
access rate of each document is now, however, divided between the two redirectors. As an example, the scaled access
rate for document 1, which in FIG. 2 is 0.35, is divided between the two redirectors so that the access rate to document
1 redirected through redirector 1 is 0.25 and the access rate redirected through redirector 2 is 0.1. The access rates for
document 2 through redirectors 1 and 2 are 0.1 and 0.4, respectively; for document 3 through redirectors 1 and 2 they are
0.04 and 0.01, respectively; for document 4 through redirectors 1 and 2 they are each 0.02; and for document 5 through
redirectors 1 and 2 they are 0.05 and 0.01, respectively. In the single redirector model of FIG. 2, equal proximity between
the redirector and the servers was assumed so that costs on the arcs between the document nodes and the server
nodes were not considered as variable parameters. It can be assumed, however, as shown in the network of FIG. 1,
that the distances d1j, 1 ≤ j ≤ 3, between redirector 1 and servers S1, S2 and S3 are such that d11 < d12 < d13. Thus,
as in the web network in FIG. 1, redirector 1 is local to S1 on local network 102, is separated from server S2 on the
Internet 105 by a mid-range distance, and is furthest from S3, which is connected on a distant local network 104. Thus,
for the two-redirector arrangement in FIG. 3, redirector 1 should redirect requests for a document to server S1 if it is
available there and redirect requests for a document to server S2 and then S3 only if the capacity of S1 is exceeded.
To model this distance factor, costs are assigned to the arcs between the redirector/document nodes and the server
nodes according to the distance between the redirector and the server. For example, requests for document 2 from
redirector 1, if redirected to nearby server S1, are designated as having a cost per unit flow of 1; if redirected to mid-distance
server S2, are designated as having a cost per unit flow of 2; and if redirected to distant server S3, are designated as
having a cost per unit flow of 3. These same costs are assigned between the other redirector/document nodes
associated with redirector 1 and the three servers. In a similar manner, redirector 2 is considered local to server S3, as shown
in FIG. 1, is separated from S2 by a mid-range distance, and is separated from S1 by a long distance. In FIG. 3, the
numerals in the square parentheses represent the costs on the arcs between the redirector/document nodes and the
server nodes that contain replicas of the documents.

[0016] The cost on the overflow arcs from the servers to the sink needs to be carefully controlled. If load balance is
the primary objective, then the costs on these arcs are chosen such that they are larger than the largest cost on the
arcs that connect the redirector/document nodes to the server nodes. Otherwise, the network solution will lead to
redirecting a flow to a close proximity device even if the load balance condition is violated as opposed to redirecting the flow
to a device that is further away without violating the load balance requirement. For example, if the cost on the overflow
arc from node S1 to the sink is chosen to be between 2 and 3, for all documents that are available in S1 and S3,
redirector 1 will deterministically send the request to S1, possibly overflowing the capacity of server S1 even if server S3 has
spare capacity. If the cost on the overflow arc is less than 2, requests for documents available on S1 and elsewhere will
always be sent to S1 even if it overflows the capacity of S1. Where load balance is the primary objective, therefore, the
costs on the overflow arcs are chosen to be 4, which is larger than the cost on any of the arcs between the
redirector/document nodes and the server nodes. FIG. 3 shows the network flow solution, with flows between
redirector/document nodes being specified without any parentheses around them, capacities specified within curved parentheses,
and, as noted, costs specified within square parentheses. Where the capacity or the cost of an arc is not specified, its
default value is 1. Using the determined flow values, the redirection probabilities, or weights, with which a request is
forwarded to a particular server that has a stored copy of the requested document can be determined. Thus, for example,
the flow on the arc from the redirector 1/document 1 node to S1 specifies that of the 0.25 units of scaled flow for the
document 1/redirector 1 pair, 0.17 units should be redirected to server S1. Thus, if a request for document 1 arrives at
redirector 1, with a probability of 0.17/0.25, it will be directed to server S1 and with probability 0.08/0.25 it will be
directed to server S2.
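
As an illustration of the probabilistic redirection just described, the sketch below draws a server for each incoming request with probability proportional to the flow on its arc. The helper name `pick_server`, the fixed seed and the sample size are assumptions made for this example only; the flows 0.17 and 0.08 are those quoted for the redirector 1/document 1 pair.

```python
import random


def pick_server(flows, rng=random):
    """Choose a server with probability proportional to its flow."""
    servers = list(flows)
    weights = [flows[s] for s in servers]
    return rng.choices(servers, weights=weights, k=1)[0]


# Flows for the redirector 1 / document 1 pair quoted in the text.
flows_r1_doc1 = {"S1": 0.17, "S2": 0.08}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_requests = 10_000
counts = {"S1": 0, "S2": 0}
for _ in range(n_requests):
    counts[pick_server(flows_r1_doc1, rng)] += 1

share_s1 = counts["S1"] / n_requests
print(f"share of requests sent to S1: {share_s1:.3f} (target {0.17 / 0.25:.2f})")
```

Over many requests the observed share sent to S1 should settle near 0.17/0.25 = 0.68, with the remainder going to S2.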

[0017] Unlike the solutions for load balancing focused on by the prior art, the present invention provides a solution 
for load sharing which considers the network delay associated with accessing a server and the server delay, which itself
is a function of the access rate on that server. A network flow approach to this problem is considered after an initial dis- 
tribution is completed for load balance. The aim of this load sharing solution is to minimize the average delay of all con- 
nections across all servers per unit of time. FIG. 1 is the network model of a preferred embodiment of this invention. As 
previously noted, hot sites rather than individual documents are mapped to servers S1, S2 and S3, which, in this 
embodiment, are proxy cache devices. The network delay, which is the sum of the round-trip propagation and the
transmission delays, is modeled by specifying them in the network flow model as costs on the arcs from the redirector/logical
item nodes to the proxy cache server nodes, where the logical items are the different cached hot sites. The links from 
the proxy cache server nodes to the sink have a cost associated with them which is a function of the flow through these 
links. This cost models the device delay. The larger the number of connections a server serves per unit of time, the 
larger the device delay for each of the connection requests. The capacities of these links are not now used to specify a
balance of load among the server devices, but to specify a limit on the average number of connections per time unit sent 
to the server device above which the connections may be dropped. As is described below, the proxy cache server delay 
and the capacity of the proxy cache server are calculated using a queuing model. 

[0018] The caching server is modeled as an M/M/1/K queue to which connection requests arrive with a Poisson 
arrival rate. The service time at the device is exponentially distributed. There is one server (CPU) and the maximum
queue space at the server is K. This maximum queue space specifies the maximum number of simultaneous connec- 
tions that the proxy cache server can support. If the number of simultaneous connection requests is greater than K, 
then requests will be dropped. If it is assumed that λ and μ are the expected arrival and service rates, then it can be
shown that the probability that there are k jobs in the queuing system is given by:

$$p_k = \frac{\left(1-\frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^{k}}{1-\left(\frac{\lambda}{\mu}\right)^{K+1}}, \qquad 0 \le k \le K \tag{1}$$


[0019] Then, the average queue length at the server is given by

$$\bar{N} = \sum_{k=0}^{K} k\,p_k$$


Using Little's Law (see L. Kleinrock, Queuing Systems, Vol. 1, Prentice Hall), the average response time or the device
delay at the server can be computed as

$$R = \frac{\bar{N}}{\lambda\,(1-p_K)} = \frac{1}{\lambda\,(1-p_K)}\sum_{k=0}^{K} k\,p_k$$

This device delay represents the sum of the queuing delay and the service delay at the server. 
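[0020] By letting

$$\rho = \frac{\lambda}{\mu}$$

and using the above equations and after manipulations, it can be shown that a closed form solution for R is given by:

$$R = \frac{1}{\mu-\lambda} - \frac{K\rho^{K}}{\mu\left(1-\rho^{K}\right)} \tag{4}$$
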

[0021] It can be noted that when K → ∞, the response time approaches

$$R = \frac{1}{\mu-\lambda}$$

which is the response time of an M/M/1 system with infinite queue size.
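
The queuing quantities above can be checked numerically with a short sketch. It assumes the standard M/M/1/K steady-state distribution and applies Little's Law with the accepted arrival rate λ(1 − p_K), which is the usual convention for a finite queue and is an assumption made here; the function names and the sample values λ = 200 and μ = 312 requests per second are illustrative only.

```python
def mm1k_state_probs(lam, mu, K):
    """Steady-state probabilities p_0..p_K of an M/M/1/K queue."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        # Degenerate case rho == 1: all K + 1 states are equally likely.
        return [1.0 / (K + 1)] * (K + 1)
    norm = (1.0 - rho) / (1.0 - rho ** (K + 1))
    return [norm * rho ** k for k in range(K + 1)]


def mm1k_response_time(lam, mu, K):
    """Mean time in system via Little's Law; assumes lam > 0."""
    p = mm1k_state_probs(lam, mu, K)
    mean_jobs = sum(k * pk for k, pk in enumerate(p))
    accepted_rate = lam * (1.0 - p[K])  # arrivals that are not dropped
    return mean_jobs / accepted_rate


lam, mu = 200.0, 312.0
r_finite = mm1k_response_time(lam, mu, K=50)
r_infinite = 1.0 / (mu - lam)  # M/M/1 limit as K grows without bound
print(f"K=50: {r_finite:.6f} s, M/M/1 limit: {r_infinite:.6f} s")
```

At this load the finite-queue response time is already indistinguishable from the M/M/1 limit, which matches the observation that the two coincide as K grows.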

[0022] Server capacity is determined by the allowed dropping probability of connection requests. When a connec- 
tion request arrives at a server, the request is dropped if there already are K connections being served by the server. 
The probability that an arriving request sees K existing connections at the server is given by P_K. If it is assumed that
the dropping probability is to be less than P_d, then the maximum flow F (and thus the capacity) that is allowed at a
server is calculated from P_K ≤ P_d. From the equation for P_K, it is found that

$$P_K = \frac{\rho^{K}\,(1-\rho)}{1-\rho^{K+1}}$$

F is substituted for λ in P_K and this value is bounded to be within P_d. Solving for F then provides the maximum flow
that can be sent to a server. From the following equation, F can be computed:

$$\frac{\left(F/\mu\right)^{K}\left(1-F/\mu\right)}{1-\left(F/\mu\right)^{K+1}} \le P_d$$

[0023] The load sharing algorithm is illustrated using an example applied to the proxy caching hot site network
arrangement of FIG. 1. The queuing model previously discussed is used to compute the server delay at each proxy
caching server. This server delay will be a function of the amount of flow sent to that server (which represents the
number of connection requests per time unit sent to that server). The maximum flow that can be sent to a server (which 
represents the maximum rate at which connection requests are redirected to a server) such that the specified condition 
on the connection dropping probability is satisfied is also computed. 

[0024] In this example, the maximum number of simultaneous connections that can be handled by a proxy caching
server (K) is set to 50. This means that if the proxy caching server is handling HTTP requests, it can handle 50 such
requests simultaneously. It drops extra connection requests that come in if it is already handling 50 requests. The
maximum connection loss probability, P_d, is set to be 0.0001, which means that on average only 1 in 10000 connections
is not completed and is dropped at the server. FIG. 4 is a plot of the connection loss probability P_K versus r = F/μ with
K = 50. From the plot it can be seen that for r ≤ 0.865, P_K < 0.0001. That is, for F ≤ 0.865μ, P_K < P_d. Depending on
the service rate μ at a server, the maximum flow F that can be sent to that server can be computed. The service rate μ
at a server specifies the rate at which HTTP connection requests are satisfied. Assuming that on average each
object is 4K bytes and that proxy servers are connected through a 10 Mbit/sec Ethernet, it can be reasonably assumed
that the bottleneck is the bandwidth of the network and not the CPU performance. Then, to transmit a 4K byte object
over a 10 Mbit/sec network requires 3.2 × 10^-3 seconds, which means that the service rate μ is

$$\mu = \frac{1}{3.2 \times 10^{-3}} \approx 312.$$


Thus, the proxy cache server can handle approximately 312 HTTP connection requests per second. If it can be
assumed for this example that all servers can handle approximately 312 connections per second, then F is
found as 0.865 × 312 ≈ 269 at all servers. The server delay as a function of the flow λ sent to the cache is given by
Equation (4) above. Given the above parameters, the server delay is given by:

$$R(\lambda) = \frac{1}{312-\lambda} - \frac{50\,(\lambda/312)^{50}}{312\left(1-(\lambda/312)^{50}\right)}$$

For purposes of the example, it is assumed that the network delay from redirector 101 to nearby caching server S1 on
the same local network is 10 ms, to mid-distance caching server S2 on the Internet 105 is 40 ms, and to caching server
S3 on the distant local network 104 is 200 ms. Similarly, the network delay from redirector 103 to nearby caching server
S3 is assumed to be 10 ms, to mid-distance caching server S2, 40 ms, and to distant caching server S1, 200 ms.
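
The capacity figures in this example can be reproduced with a small numerical sketch. It assumes that "4K bytes" means 4000 bytes (which gives the 3.2 × 10^-3 second transfer time quoted), and it finds the largest utilization r = F/μ satisfying the drop-probability bound by bisection, relying on the drop probability increasing with load. The function names are illustrative and not taken from the source.

```python
def drop_probability(rho, K):
    """P_K for an M/M/1/K queue: probability an arrival finds K jobs present."""
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (K + 1)
    return rho ** K * (1.0 - rho) / (1.0 - rho ** (K + 1))


def max_utilization(K, p_drop, tol=1e-9):
    """Largest rho in (0, 1) with P_K(rho) <= p_drop, found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if drop_probability(mid, K) <= p_drop:
            lo = mid
        else:
            hi = mid
    return lo


object_bits = 4_000 * 8                  # assumption: 4K bytes taken as 4000 bytes
link_bps = 10_000_000                    # 10 Mbit/sec Ethernet
service_time = object_bits / link_bps    # seconds per object
mu = 1.0 / service_time                  # requests per second
rho_max = max_utilization(K=50, p_drop=1e-4)
f_max = rho_max * mu
print(f"service rate {mu:.1f}/s, max utilization {rho_max:.4f}, max flow {f_max:.1f}/s")
```

The computed threshold lies just above the 0.865 read from FIG. 4, so the maximum flow comes out close to the rounded figure of 269 requests per second used in the text.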
[0025] FIG. 5 illustrates a network flow model 500 of the server system 100 in FIG. 1 in which the parameter values
noted above have been incorporated. From the initial distribution, caching server S1 is responsible for hot sites 1, 2, 3
and 5; caching server S2 is responsible for hot sites 1, 3, 4 and 5; and caching server S3 is responsible for hot sites 2,
3 and 5. This mapping is noted in FIG. 1 within the S1, S2 and S3 caching servers. This is parallel to the mapping of
documents on the three servers used in conjunction with the description of FIG. 2 for load balancing. From source 501,
a total of 800 connection requests per second are generated for all hot sites. The requests flowing through redirector 




101 are represented by the arcs between the source 501 and the redirector/hot site nodes 1-5, and the requests flowing
through redirector 103 are represented by the arcs between the source 501 and the redirector/hot site nodes 1'-5'.
The numbers associated with the arcs from source 501 to the redirector/hot site nodes 1-5 and 1'-5' specify
the connection requests flowing through each respective redirector to each hot site. The costs representing the network
delays are shown in the square brackets on the arcs between each redirector/hot site node and the server nodes. Each
caching server is assumed, as noted above, to support up to 50 connections per time unit without dropping any connections.
If more requests are directed to these devices, then some connections may be dropped. All overflow connections
are handled through the overflow arcs between the server nodes and the sink 502. The cost parameter chosen for
these arcs (represented in FIG. 5 as infinity M) should be chosen to be a larger number than the cost on all other arcs.
The server delay at the proxy caches, shown as the cost on the arcs between each proxy cache server and the sink
502, is given by $R(\lambda_1)$, $R(\lambda_2)$, and $R(\lambda_3)$, where $\lambda_i$ is equal to the connection requests directed to caching server $S_i$ and
$R(\lambda_i)$ is calculated using equation (4).

[0026] The model in FIG. 5 is solved using the aforenoted AMPL mathematical programming language and the
MINOS non-linear program solver with a goal of finding a solution to the non-linear problem such that the average
response time for all of the 800 connections arriving per second is minimized. AMPL is a mathematical programming
language that can be used to specify different optimization problems. It is used here to specify a minimum cost network
flow optimization problem. For example, the load sharing network flow model of FIG. 5 can be modeled using the program
shown in Appendix 1. The program defines the nodes in lines 101-103 and the connectivity of the nodes in lines 104-106,
specifies that the flow in must equal the flow out at line 107, specifies various criteria in lines 108-113, the function
to be minimized in line 114 (postulated in the program as a minimization of total cost), and the different constraints of
the problem in lines 115-118. The different parameters of this network, such as the number of hot sites, the number
of redirectors and the number of proxy cache servers, can be varied by specifying them in a data file with which the program
is associated. Further, the input parameters of the model such as access rates, delays and capacities are also
specified in this data file. The data file for the example in FIG. 5 is shown in Appendix 2. The AMPL environment takes
the model file and data file and uses one of several different solvers to solve the optimization problem depending on the
nature of the problem. As the specified problem is a non-linear optimization problem, the MINOS solver is used to solve
the problem. The output of the solver provides the flow on each link between the redirector/hot site nodes and the servers.
These flow values, noted on these arcs, are then used by the redirectors to determine the probabilities or weights with
which a request for a hot site is redirected by that redirector to a particular proxy cache server that is responsible for
that requested hot site.

[0027] With reference to the mathematical example shown in FIG. 5, of the 200 requests per second that arrive for
hot site 1 at redirector 1, the solution indicates that 123 requests are redirected to caching server S1 and 77 requests
are redirected to caching server S2. This means that when a subsequent request for hot site 1 arrives at redirector 1, it
should be redirected to server S1 with a probability of 123/200 and to S2 with a probability of 77/200. Similarly, when a
subsequent request for hot site 5 arrives at redirector 2, it is redirected to server S2 with a probability of 33/56 and to
server S3 with a probability of 23/56. The solution illustrated in FIG. 5 results in all the other requests arriving at a redirector
being redirected to the closest server. All the requests that do arrive at server S2 in the example are those that cannot
be served at a closer server because a) the requested hot site is not cached at the closer server; or b) redirecting the
request to the closer server would violate its capacity requirement. The flows, the $\lambda$ values, going into each server are
shown on the arcs from the server nodes to the sink node 502. The capacity of each such node is shown on these same
arcs in the curved parentheses. As can be noted, servers S1 and S3 are filled to their capacity while server S2 still has
some spare capacity available.
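The conversion from solver flows to redirection probabilities described above can be sketched as follows. This is a minimal Python illustration rather than the implementation of the invention; the function names `weights_from_flows` and `pick_server` and the dictionary layout are assumptions introduced here.

```python
import random


def weights_from_flows(flows):
    """Normalize per-server flows for one redirector/hot site pair into probabilities."""
    total = sum(flows.values())
    if total <= 0:
        raise ValueError("total flow must be positive")
    return {server: flow / total for server, flow in flows.items()}


def pick_server(weights, rng):
    """Choose a caching server at random according to the given weights."""
    servers = list(weights)
    return rng.choices(servers, weights=[weights[s] for s in servers], k=1)[0]


# Hot site 1 at redirector 1 in the example: 123 requests to S1 and 77 to S2.
weights = weights_from_flows({"S1": 123, "S2": 77})
chosen = pick_server(weights, random.Random())
```

Each incoming request draws its server independently, so over many requests the share sent to each server approaches the flow computed by the solver.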

[0028] It can be noted from FIG. 5 that the load is now not balanced but shared among the servers such that the 
average delay is minimized. From the solution, the total minimized delay of all the 800 connections is calculated to be 
15860 ms for an average delay of 19.825 ms per connection. The number of connections redirected to each device is
noted to be less than or equal to the maximum number of connections that can be handled and thus the condition on 
the probability of dropped connections is satisfied. Thus, as can be noted in FIG. 5, there is zero flow in the overflow 
arcs between the server nodes and sink 502. 

[0029] In accordance with the embodiment of the present invention in FIG. 1, a connection management station
(CMS) 115 performs a network flow computation to calculate the weights with which redirectors 101 and 103 will redirect
further incoming requests for one of the five numbered hot sites. Once the calculation is performed, the resultant
weight values are sent back to the appropriate redirector over, for example, a TCP/IP connection. CMS 115 is shown
connected to the Internet 105 but in actuality can be connected anywhere on the network in FIG. 1, such as on local
network 102 or on local network 104. In order for the CMS 115 to perform a network flow calculation, it periodically collects
access rate and network delay information from redirectors 101 and 103 and server delay information from the
caching proxy servers S1, S2 and S3.

[0030] Access rate information to each hot site is determined by each redirector by associating the destination
address in the SYN packets with a set of hot site IP addresses. Alternatively, this information can be collected by examining,
at the redirectors, the HOST field in the GET packets.
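The SYN-based access rate measurement can be sketched in Python as below. The sketch assumes the redirector is handed the destination address of each observed SYN packet as a string; the class name `AccessRateCounter`, its methods, and the address-to-site table are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter


class AccessRateCounter:
    """Counts SYN packets per hot site from their destination addresses."""

    def __init__(self, hot_site_of_ip):
        # Mapping from destination IP address to hot site identifier (assumed input).
        self._hot_site_of_ip = dict(hot_site_of_ip)
        self._counts = Counter()

    def observe_syn(self, destination_ip):
        """Record one SYN packet; traffic to addresses outside the hot site set is ignored."""
        site = self._hot_site_of_ip.get(destination_ip)
        if site is not None:
            self._counts[site] += 1

    def rates(self, interval_seconds):
        """Return requests per second for each hot site seen during the interval."""
        return {site: n / interval_seconds for site, n in self._counts.items()}


counter = AccessRateCounter({"192.0.2.1": 1, "192.0.2.2": 1, "192.0.2.9": 2})
for ip in ["192.0.2.1", "192.0.2.2", "192.0.2.9", "198.51.100.7"]:
    counter.observe_syn(ip)
per_site_rates = counter.rates(interval_seconds=2.0)
```

Several addresses may map to the same hot site, which is why the table is keyed by address rather than by site.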

[0031] In the network configuration of FIG. 1 in which redirector 101 is local to clients 106-1 through 106-N and redirector
103 is local to clients 107-1 through 107-M, the redirectors do not have to treat traffic from different local clients in different
manners. Network delay can therefore be accounted for by just considering the delay from each redirector to each of
the different caching servers. For a symmetric flow of traffic in which packets from the clients to the caching servers and
from the caching servers to the clients flow through the redirector, network delay can be tracked by each redirector by
computing the time between redirecting a SYN packet to a particular caching server and receiving the corresponding
SYN ACK packet back from that server. In an asymmetric flow of traffic in which the client-to-server traffic flows through
the redirector but the reverse traffic flows directly from the server to the client, the SYN ACK packet will not flow through
the redirector. Therefore, network delay can be measured with another mechanism such as by PINGing the servers
periodically.
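For the symmetric case, the SYN to SYN ACK timing can be sketched as follows. This Python fragment assumes that the redirector can identify a connection by some key and is given timestamps by the caller; the class `SynDelayTracker` and its method names are hypothetical, and averaging the samples is one possible way to summarize them, not something the description prescribes.

```python
class SynDelayTracker:
    """Tracks redirector-to-server delay from SYN / SYN ACK timestamps."""

    def __init__(self):
        self._pending = {}   # connection id -> (server, time the SYN was redirected)
        self._samples = {}   # server -> list of measured delays

    def on_syn_redirected(self, connection_id, server, now):
        """Remember when a SYN was redirected to a caching server."""
        self._pending[connection_id] = (server, now)

    def on_syn_ack(self, connection_id, now):
        """Record the delay when the matching SYN ACK returns; None if unmatched."""
        entry = self._pending.pop(connection_id, None)
        if entry is None:
            return None
        server, sent_at = entry
        delay = now - sent_at
        self._samples.setdefault(server, []).append(delay)
        return delay

    def mean_delay(self, server):
        """Average measured delay for a server, or None if no samples exist."""
        samples = self._samples.get(server)
        return sum(samples) / len(samples) if samples else None


tracker = SynDelayTracker()
tracker.on_syn_redirected("conn-1", "S1", now=1.0)
tracker.on_syn_ack("conn-1", now=1.25)
```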

[0032] Server delay at each of the caching proxy servers S1, S2 and S3 is calculated using the aforedescribed
queuing model which resulted in equation (4), from which the server delay $R(\lambda_i)$ is determined as a function of the flow
$\lambda_i$ into server $S_i$.

[0033] Once an initial distribution of hot sites onto caching servers is performed based on access rate information
to achieve load balancing in the manner specified in the co-pending application, CMS 115 performs a network flow computation
for purposes of load sharing to determine the weights with which the redirectors should redirect requests to
replicated hot site caches. Using these determined weights at each redirector minimizes the average delay of all connections
across all of the caching servers per unit of time. This load sharing network flow computation is continually
updated by periodically collecting current access rate and network delay information from each redirector and server
delay information from the cache servers. Thus, the weights are continually updated based on the latest access rate,
network delay and server delay information. Further, if the CMS 115 detects a failure of a caching server, it will trigger
a network flow computation. In this case, the corresponding node is removed from the network flow model as well as all
arcs incident upon it. Further, the service rate, $\mu$, may change at a caching server if, for example, one of two processors
fails or if a device is replaced by another device with a higher performance CPU. This will affect two parameters: a) the
delay at a server given that a specific number of connections are redirected to that server; and b) the capacity of the
server in terms of the maximum flow that can be redirected to that server. As a result of a service rate change, a network
flow computation can be triggered. Even further, if the access rate to a particular hot site suddenly and dramatically
changes, a redirector will trigger a network flow computation. In this case, the flows on the arcs from the source to the
redirector/hot site nodes are changed on the network flow model. Changes in other parameters can also affect the network
flow computation. Thus, if the relative round-trip delay from the redirectors to the caching servers changes, the
costs associated with the arcs from the nodes representing the redirectors to the server nodes are changed. Also, since
server load and server delay are determined by the number of requests redirected to a caching server, a change in such
number of requests per second redirected to a server will change the server delay parameter.
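The model update on a server failure amounts to deleting a node together with its incident arcs. A minimal Python sketch follows, assuming the model is held as a dictionary keyed by (tail, head) arcs; this representation, the function name `remove_node`, and the node labels are illustrative assumptions only.

```python
def remove_node(arcs, node):
    """Return a copy of the arc map without any arc entering or leaving node."""
    return {
        (tail, head): cost
        for (tail, head), cost in arcs.items()
        if node not in (tail, head)
    }


# Costs are network delays in ms; the server-to-sink costs are placeholders.
arcs = {
    ("site1@r101", "S1"): 10,
    ("site1@r101", "S2"): 40,
    ("S1", "sink"): 0,
    ("S2", "sink"): 0,
}
arcs_after_failure = remove_node(arcs, "S1")
```

After the removal, the remaining arcs are passed to the network flow computation unchanged, so requests previously sent to the failed server are spread over the servers still mapping the hot site.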

[0034] FIG. 6 is a flowchart detailing the steps of the method of the present invention. At step 601, access rate information
is obtained by CMS 115 from each redirector. At step 602, using this access rate information, an initial distribution
of hot sites on the caching servers is determined using the prior art load balancing network flow algorithm. Once
the initial distribution is determined, at step 603, CMS 115 obtains current access rate and network delay information
for each redirector, and the server delay of each server is calculated. Using these inputs, at step 604, a network flow
problem for load sharing is solved. At step 605, the probabilities (or weights) for each redirector-hot site pair
are determined and sent to each redirector. At decision step 606, a determination is made whether there has been an
access rate change at a redirector. If yes, an update is triggered at step 607, which in turn causes the current access
rate, network delay, and server delay to be determined or calculated back at step 603. Similarly, at decision step 608, a
determination is made whether a server failure is detected. If yes, an update is triggered again at step 607. Further, at
decision step 609, a determination is made whether or not a change in the delay at a caching server is detected. If yes,
an update is triggered at step 607. If an access rate change, a server failure, or a caching server delay change is not
detected at any of decision steps 606, 608 or 609, respectively, then, at decision step 610, a determination is made
whether the elapsed time since the last update has exceeded a threshold period of time, T. If not, the flow returns to the
inputs of decision steps 606, 608 and 609. If the elapsed time has exceeded T, then an update is triggered at step 607.
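The decision logic of steps 606 through 610 can be summarized in a few lines. The following Python sketch is only an illustration of that logic; the function name `update_needed` and its parameter names are assumptions, and how each change is detected is left outside the sketch.

```python
def update_needed(access_rate_changed, server_failed, server_delay_changed,
                  elapsed_seconds, threshold_seconds):
    """Return True when a new network flow computation should be triggered (step 607)."""
    # Decision steps 606, 608 and 609: any detected change triggers an update.
    if access_rate_changed or server_failed or server_delay_changed:
        return True
    # Decision step 610: otherwise update only once the threshold T has been exceeded.
    return elapsed_seconds > threshold_seconds


trigger = update_needed(False, False, True, elapsed_seconds=5.0, threshold_seconds=60.0)
```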

[0035] In the described embodiment, CMS 115 performs a centralized data gathering and network flow analysis
function. Such functions could alternatively be performed at either redirector through TCP/IP communication between 
both such redirectors for exchanging access rate and delay information. Further, server delays need to be communi- 
cated to the redirectors. 

[0036] Although described in conjunction with a system designed to achieve load balance across plural caching
servers containing replicated hot sites for which in the network flow model the logical item at each redirector/logical item
node represents a hot site, the present invention is not limited to such an arrangement. Thus, rather than having the
logical items which are mapped to different servers being hot sites, as described above, the logical items could be any
group of hot documents which are mapped onto a plurality of local caching servers in accordance with the documents'
origin server IP addresses. The origin server IP addresses of the documents are hashed to different groups. The different
groups are then considered the logical items which are mapped onto the plural caching servers. Since many origin
servers use multiple IP addresses, to prevent the same origin server name from being mapped to multiple caches, the
hashing function is chosen so that all the IP addresses for an origin server are mapped to the same group out of a possible
256 different groups. Since origin servers normally use a contiguous block of IP addresses, if the hashing is
based on the first 8 bits of the origin server IP address, this contiguous block of IP addresses will be automatically mapped
to the same group.
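Hashing on the first 8 bits of an address can be illustrated with the short Python sketch below. It assumes IPv4 addresses in dotted-decimal form; the function name `group_of` and the sample addresses are introduced here for illustration.

```python
import ipaddress


def group_of(ip_address: str) -> int:
    """Map an IPv4 address to one of 256 groups using its first 8 bits."""
    return int(ipaddress.IPv4Address(ip_address)) >> 24


# Addresses from one contiguous block fall into the same group.
same_group = group_of("192.0.2.10") == group_of("192.0.2.200")
```

Because only the leading octet is examined, every address in a contiguous block that shares that octet lands in the same group, which is the property the description relies on.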

[0037] Alternatively, the logical items could be any group of hot documents identified by the URLs, as might be done
if a content-smart switch is used as the redirector. For this case, documents may be grouped according to the type of
objects requested, such as .html, .gif, .jpeg, etc.
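Grouping by object type can be sketched as follows, assuming the redirector sees the full request URL. The Python function name `object_type` and the sample URLs are hypothetical; the sketch simply takes the file extension of the URL path as the group label.

```python
import posixpath
from collections import defaultdict
from urllib.parse import urlparse


def object_type(url: str) -> str:
    """Return the lowercase file extension of the URL path, or '' if there is none."""
    return posixpath.splitext(urlparse(url).path)[1].lower()


def group_by_type(urls):
    """Collect URLs into logical items keyed by object type."""
    groups = defaultdict(list)
    for url in urls:
        groups[object_type(url)].append(url)
    return dict(groups)


groups = group_by_type([
    "http://www.example.com/index.html",
    "http://www.example.com/images/logo.GIF",
    "http://www.example.com/photos/view.jpeg?size=large",
])
```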

[0038] Another way to perform the group mapping could be based on the logical names of the origin servers. Different
logical names could be mapped to different groups. This requires looking at the HOST field found in the GET
packets to keep track of the access rates to the different servers, and thereby to the groups. The initial distribution algorithm
can then be used to decide where the logical items should be distributed. Alternatively, the initial distribution can
be performed based on other information such as the proximity of the cache to the clients requesting a specific document.
Once the initial distribution is performed by either of these methods, or by another method, the load sharing algorithm
of the present invention is performed to minimize the average delay of all connections across all of the caching servers
per unit of time. Thus, access rate information to each formed group, network delay and server delay are determined
as inputs to the network flow problem. From the network flow solution to the non-linear optimization problem, the optimum
weights for each redirector/group of documents for the desired load sharing are determined. These weights then
determine the probabilities with which the redirector directs a request for a document within one of the replicated logical
items, the latter being one of the formed groups of origin server IP addresses.

[0039] In the embodiments discussed hereinabove, it has been assumed that the clients are local to one of the redirectors.
Therefore, the network delay between the client and the redirector has not been considered as a factor in the
solution to the load sharing problem. The other possibility is for the redirectors to be closer to a plurality of servers. In
this scenario, a client's request for a logical name is resolved by a Domain Name Server into a redirector's IP address,
which in turn redirects the request to a server in a cluster of essentially duplicated servers. In this case, server side load
balancing is achieved, in accordance with the present invention, so as to minimize the average delay per unit of time of
all requests of all connections across all the servers in the cluster. In such an embodiment, client IP addresses are
mapped into groups by a simple hashing function. As an example, the first 8 bits of the client IP addresses can be used
to determine which of 256 client groups a client is mapped into. Other hashing functions could be found that evenly distribute
the clients among the groups such that the access rates of clients allocated to each of the groups are evenly distributed
among the groups. Regardless of the hashing function, each group becomes a logical item which is mapped
onto the physical devices, which are, in this case, the back-end servers. The initial distribution algorithm can be used to
map these logical items to the servers. The network flow based algorithm can then be used to compute the redirector
probabilities associated with each redirector/logical item pair, which in this case is a redirector/client group pair. In the
network flow model then, the arcs between the source and each redirector/client group pair represent the requests generated
from each group that need to be redirected to one of the back-end servers. The load sharing solution to the network
flow problem, in accordance with the present invention, will produce the weights with which the redirectors direct
requests from each of the client groups to the plural servers. The cost associated with each arc represents the delay.
Thus, the cost on an arc between the source and one of the redirector/client group nodes represents the network delay
encountered between a client group and the redirector, while the cost on an arc between a redirector/client group node
and a server represents network delay between the redirector and one of the servers, plus that server's delay. In the hot
site embodiment of the present invention previously described, it was assumed that the clients and the redirectors were
local to one another so that the delay between each was a constant small value, and thus not a variable. With server
side load sharing, this delay is certainly a variable that is considered in the network flow load sharing solution.
[0040] If the redirector is local to the back-end servers, then the network delay computation does not have to 
include the delay from the redirector to the servers. By grouping the clients together in groups according to their IP 
addresses, the clients within each group are likely to be geographically proximate. Thus, the network delay can be 
determined for each group of clients and a specific server by determining the network delay from each client group to 
the redirector. For a symmetric traffic flow, the delay from the client group to the redirector can be calculated at the redi- 
rector by keeping track of the time between redirecting a SYN ACK packet from the server to the client group and the 
time an ACK packet for the SYN ACK packet is received at the redirector as part of the three-way TCP handshake. With 
an asymmetric traffic flow model, the SYN ACK packet does not flow through the redirector. In this case, the redirector 
can periodically PING one or more clients in a client group to calculate the delay from that client group to the redirector. 
Further, the redirector separately keeps track of the access rates from each client group in order to solve the network 
flow for load sharing. 

[0041] If the redirector, rather than being local to the back-end servers, functions as a virtual server which, upon
receipt of a request, redirects the request to one of a plurality of servers located in different locations on the wide area
network, then the delay between the redirector and each server also needs to be taken into account in addition to the
delay between each client group and the redirector. If the traffic flow is symmetric, the redirector can calculate the network
delay from a client group to a server by adding the delay from the client group to the redirector to the delay from
the redirector to the server. The delay from the redirector to the server can be calculated as previously described by
keeping track of the time between redirecting a SYN packet to a server and receiving the corresponding SYN ACK
packet. As before, the delay from the client group to the redirector can be calculated at the redirector by keeping track
of the time between redirecting the SYN ACK packet from the server to the client group and the time an ACK packet for
this SYN ACK packet is received at the redirector. If the traffic flow is asymmetric, the delay on the traffic that flows
directly from the server to the client is of interest. The client-to-server delay is not as important a parameter since most
of the HTTP traffic flows from the servers to the clients. The mechanism used for the symmetric flow case can be used
as an approximation for the server-to-client delay, or it can be estimated by the redirector using geographical information,
such as that provided by IANA as used in Classless Inter-Domain Routing Protocol.
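For the symmetric case, the combined delay is the sum of the two measured legs. A minimal Python sketch, assuming the two sets of measurements are available as dictionaries keyed by client group and by server; the function name `client_to_server_delays` and the sample values are hypothetical.

```python
def client_to_server_delays(client_to_redirector, redirector_to_server):
    """Add the two measured legs to obtain a delay for every (client group, server) pair."""
    return {
        (group, server): leg_one + leg_two
        for group, leg_one in client_to_redirector.items()
        for server, leg_two in redirector_to_server.items()
    }


# Delays in ms; the group numbers and server names are placeholders.
delays = client_to_server_delays({12: 30, 135: 80}, {"A": 10, "B": 200})
```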

[0042] The network flow model can thus be used to solve the load sharing problem when the logical item in the redirector/logical
item pair is associated with the clients making requests to a server or the servers providing responses to
requests from clients. Further, the network model is flexible enough to provide a solution when a delay is associated with the
link between a client group and the redirector, with the link between the redirector and the server, or with both links.
[0043] The foregoing therefore merely illustrates the principles of the invention. It will thus be appreciated that those
skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein,
embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional
language recited hereinabove are principally intended expressly to be only for pedagogical purposes to aid the
reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the
art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover,
all statements hereinabove reciting principles, aspects, and embodiments of the invention, as well as specific examples
thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that
such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements
developed that perform the same function, regardless of structure.

[0044] Thus, for example, it will be appreciated by those skilled in the art that the block diagrams and flowcharts
described hereinabove represent conceptual views of illustrative circuitry and processes embodying the principles of
the invention. Similarly, it will be appreciated that any flowcharts, flow diagrams, state transition diagrams, pseudocode,
and the like represent various processes which may be substantially represented in computer readable medium and so
executed by a computer or processor, whether or not such a computer or processor is explicitly shown.
[0045] The functions of the various elements shown in the FIGS., including functional blocks labeled as "processors",
may be provided through the use of dedicated hardware as well as hardware capable of executing software in
association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated
processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared.
Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware
capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware,
read-only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage. Other hardware,
conventional and/or custom, may also be included. Similarly, any switches shown in the FIGS. are conceptual
only. Their function may be carried out through the operation of program logic, through dedicated logic, through the
interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the
implementer as more specifically understood from the context.

[0046] In the claims hereof, any element expressed as a means for performing a specified function is intended to
encompass any way of performing that function including, for example, a) a combination of circuit elements which performs
that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with
appropriate circuitry for executing that software to perform the function. The invention as defined by such claims resides
in the fact that the functionalities provided by the various recited means are combined and brought together in the manner
which the claims call for. Applicant thus regards any means which can provide those functionalities as equivalent to
those shown hereinabove.

APPENDIX 1



101 set NODES;
102 set SINK;
103 set D_NODES;
104 set LINKS within {NODES cross NODES};   # Links connect nodes to nodes
105 set SP_LINKS within {NODES cross SINK}; set D1_LINKS within {NODES cross D_NODES};
106 set D2_LINKS within {D_NODES cross SINK};

    param supply {NODES} >= 0;
    param demand {NODES} >= 0;
    param supply1 {SINK} >= 0;
    param demand1 {SINK} >= 0;
    param supply2 {D_NODES} >= 0;
    param demand2 {D_NODES} >= 0;

107 check: sum{i in NODES} supply[i] + sum{i in SINK} supply1[i] + sum{i in D_NODES} supply2[i]
    = sum{j in NODES} demand[j] + sum{j in SINK} demand1[j] + sum{j in D_NODES} demand2[j];
                                        # Check that the total demand at all
                                        # nodes is less than the total
                                        # supply available at all nodes

    param cost {LINKS} >= 0;
    param capacity {LINKS} >= 0;
    param cost1 {SP_LINKS} >= 0;
    param maxconnections {SP_LINKS} >= 0;
    param probability {SP_LINKS} >= 0;
    param cost2 {D1_LINKS} >= 0;
    param capacity2 {D1_LINKS} >= 0;
    param cost3 {D2_LINKS} >= 0;
    param capacity3 {D2_LINKS} >= 0;

108 var Ship {(i,j) in LINKS} >= 0, <= capacity[i,j];   # The amount that is shipped through a link
                                        # should be greater than zero and less
                                        # than the capacity of the link
109 var Ship1 {(i,j) in SP_LINKS} >= 1, <= cost1[i,j];
110 var Ship2 {(i,j) in D1_LINKS} >= 0, <= capacity2[i,j];
111 var Ship3 {(i,j) in D2_LINKS} >= 0, <= capacity3[i,j];
112 var R {(i,j) in SP_LINKS} = Ship1[i,j]/cost1[i,j];
113 var P {(i,j) in SP_LINKS} =
    ((R[i,j]^maxconnections[i,j]

# Minimize the total cost in shipping
# in the whole network over all the links

    minimize Total_Cost:
114 sum{(i,j) in LINKS} cost[i,j] * Ship[i,j] +
    sum{(i,j) in SP_LINKS} (1/Ship1[i,j]) * (R[i,j]/(1-R[i,j])) * (1-
    (((maxconnections[i,j]+1)*(R[i,j]^maxconnections[i,j])*(1-R[i,j]))/(1-
    (R[i,j]^(maxconnections[i,j]+1))))) +
    sum{(i,j) in D1_LINKS} cost2[i,j] * Ship2[i,j] +
    sum{(i,j) in D2_LINKS} cost3[i,j] * Ship3[i,j];

# The flow condition is that at each node, the flow in + supply at that node should
# be smaller or equal to the flow out + the demand at that node

    subject to Balance1 {k in NODES}:              # The flow condition at each node should
115 supply[k] + sum{(i,k) in LINKS} Ship[i,k]      # be satisfied
    <= demand[k] + sum{(k,j) in LINKS} Ship[k,j] + sum{(k,j) in SP_LINKS} Ship1[k,j]
    + sum{(k,j) in D1_LINKS} Ship2[k,j];

    subject to Balance2 {k in D_NODES}:
116 supply2[k] + sum{(i,k) in D1_LINKS} Ship2[i,k] <= demand2[k] + sum{(k,j) in D2_LINKS}
    Ship3[k,j];

    subject to Balance3 {k in SINK}:
117 supply1[k] + sum{(i,k) in SP_LINKS} Ship1[i,k] + sum{(i,k) in D2_LINKS} Ship3[i,k] <=
    demand1[k];

    subject to Capacityconstraint {(i,k) in SP_LINKS}:
118 P[i,k] <= probability[i,k];

APPENDIX 2



set NODES := HSA1A1 HSA1A2 HSA1A3 HSA1A4 HSA1A5 HSA2A1 HSA2A2 HSA2A3
             HSA2A4 HSA2A5 CSA1 CSA2 CSA3;

set SINK := Sink;

set D_NODES := NOPA1 NOPA2 NOPA3;

param supply default 0 := HSA1A1 200 HSA1A2 60 HSA1A3 24 HSA1A4 62 HSA1A5 30
                          HSA2A1 60 HSA2A2 240 HSA2A3 6 HSA2A4 62 HSA2A5 56;

param demand default 0;

param supply1 default 0;

param demand1 default 0 := Sink 800;

param supply2 default 0;

param demand2 default 0;

set LINKS := (HSA1A1,CSA1) (HSA1A1,CSA2) (HSA1A2,CSA1) (HSA1A2,CSA2)
(HSA1A2,CSA3) (HSA1A3,CSA2) (HSA1A3,CSA3) (HSA1A4,CSA1) (HSA1A4,CSA2)
(HSA1A5,CSA1) (HSA1A5,CSA3) (HSA2A1,CSA1) (HSA2A1,CSA2) (HSA2A2,CSA1)
(HSA2A2,CSA2) (HSA2A2,CSA3) (HSA2A3,CSA2) (HSA2A3,CSA3) (HSA2A4,CSA1)
(HSA2A4,CSA2) (HSA2A5,CSA1) (HSA2A5,CSA3);

set SP_LINKS := (CSA1,Sink) (CSA2,Sink) (CSA3,Sink);

set D1_LINKS := (CSA1,NOPA1) (CSA2,NOPA2) (CSA3,NOPA3);

set D2_LINKS := (NOPA1,Sink) (NOPA2,Sink) (NOPA3,Sink);



param: cost capacity :=

HSA1A1 CSA1  10 600
HSA1A1 CSA2  40 800
HSA1A2 CSA1  10 800
HSA1A2 CSA2  40 800
HSA1A2 CSA3 200 800
HSA1A3 CSA2  40 800
HSA1A3 CSA3 200 800
HSA1A4 CSA1  10 800
HSA1A4 CSA2  40 800
HSA1A5 CSA1  10 800
HSA1A5 CSA3 200 800
HSA2A1 CSA1 200 800
HSA2A1 CSA2  40 800
HSA2A2 CSA1 200 800
HSA2A2 CSA2  40 800
HSA2A2 CSA3  10 800
HSA2A3 CSA2  40 800
HSA2A3 CSA3  10 800
HSA2A4 CSA1 200 800
HSA2A4 CSA2  40 800
HSA2A5 CSA1 200 800
HSA2A5 CSA3  10 800 ;

param: cost1 maxconnections probability :=

CSA1 Sink 312 50 0.000100
CSA2 Sink 312 50 0.000100
CSA3 Sink 312 50 0.000100 ;

param: cost2 capacity2 :=

CSA1 NOPA1 10000 800
CSA2 NOPA2 10000 800
CSA3 NOPA3 10000 800 ;

param: cost3 capacity3 :=

NOPA1 Sink 10000 800
NOPA2 Sink 10000 800
NOPA3 Sink 10000 800 ;




Claims 

1 . A method of processing client requests through at least one redirector to a plurality of servers connected on a com- 
munications network to minimize an average delay associated with the client requests, at least some of the client 

so requests being capable of being satisfied by more than one of the servers, the method comprising the steps of: 

a) determining an access rate of requests associated with each of a plurality of redirector-logical item pairs; 

b) determining a network delay between each of a plurality of clients and the plurality of servers; 

c) determining a server delay incurred in processing a client request at each of the plurality of servers; 

35 d) using the determined access rates of requests in step a), the network delays determined in step b) and the 

server delays determined in step c) as inputs, solving a non-liner program optimization problem to determine 
a set of weights associated with each of the plurality of redirector-logical item pairs so as to minimize the aver- 
age delay associated with the client requests; and 

e) probabilistically forwarding a client request through the at least one redirector to a server that can satisfy that 
40 request using the determined weights associated with the redirector-logical pair item. 

2. In a system which processes client requests through at least one redirector to a plurality of servers connected on
a communications network, at least some of the client requests being capable of being satisfied by more than one
of the servers, apparatus for minimizing an average delay associated with the client requests, the apparatus comprising:

means for determining an access rate of requests associated with each of a plurality of redirector-logical item 
pairs; 

means for determining a network delay between each of a plurality of clients and the plurality of servers; 

means for determining a server delay incurred in processing a client request at each of the plurality of servers;

means for solving a non-linear programming optimization problem to determine a set of weights associated 
with each of the plurality of redirector-logical item pairs so as to minimize the average delay of client requests
using the determined access rate of requests, the determined network delays, and the determined server 
delays as inputs to the problem; and 

means for probabilistically forwarding a client request through the at least one redirector to a server in the
system that can satisfy that request using the determined weights associated with the redirector-logical item pair.

3. Apparatus as claimed in claim 2 wherein the means for solving a non-linear optimization problem comprises means
for formulating and solving a minimum cost network flow problem.

4. Apparatus as claimed in claim 3 further comprising means for determining an initial distribution that maps logical 
items onto the servers. 

5. Apparatus as claimed in claim 3 wherein the means for solving the non-linear optimization problem periodically 
determines a new set of weights associated with each redirector-logical item pair to thereafter be used by the at 
least one redirector, or determines a new set of weights when a change of an access rate at a redirector is 
detected, or determines a new set of weights when a server failure is detected, or determines a new set of weights 
when a change in network delay is detected, or determines a new set of weights when a change in the delay at a 
server is detected. 

6. Apparatus as claimed in claim 3 further comprising means for forwarding the determined weights to the at least one 
redirector. 

7. In a system which processes client requests through at least one redirector to a plurality of servers connected on 
a communications network, at least some of the client requests being capable of being satisfied by more than one 
of the servers, a method of determining a set of weights with which the at least one redirector will probabilistically 
forward client requests to the server in the system that can satisfy the requests comprising the steps of: 

a) determining an access rate of requests associated with each of a plurality of redirector-logical item pairs; 

b) determining a network delay between each of a plurality of clients and the plurality of servers; 

c) determining a server delay incurred in processing a client request at each of the plurality of servers; and 

d) using the determined access rates of requests in step a), the network delays determined in step b) and the 
server delays determined in step c) as inputs, solving a non-linear program optimization problem to determine 
the set of weights associated with each of the plurality of redirector-logical item pairs so as to minimize the 
average delay associated with the client requests. 
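
The quantity minimized in step d) can be sketched as follows. The claims do not fix a particular queuing model, so this example assumes an M/M/1 queue for each server (mean delay 1/(service rate - arrival rate)); the function names, the dictionary layout, and the numbers are hypothetical.

```python
def mm1_delay(service_rate, arrival_rate):
    """Mean time in an M/M/1 system (assumed model, not mandated by the claims)."""
    if arrival_rate >= service_rate:
        raise ValueError("server is overloaded: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)

def average_delay(flows, network_delay, service_rate):
    """Rate-weighted mean of network delay plus server delay.

    flows[(pair, server)]         -> request rate sent from that pair to that server
    network_delay[(pair, server)] -> measured network delay for that route
    service_rate[server]          -> requests per second the server can handle
    """
    # Total load on each server determines its queuing delay.
    load = {}
    for (_, server), rate in flows.items():
        load[server] = load.get(server, 0.0) + rate

    total_rate = sum(flows.values())
    total = 0.0
    for (pair, server), rate in flows.items():
        server_delay = mm1_delay(service_rate[server], load[server])
        total += rate * (network_delay[(pair, server)] + server_delay)
    return total / total_rate

# Illustrative inputs only.
flows = {("R1/siteA", "S1"): 30.0, ("R1/siteA", "S2"): 50.0}
network_delay = {("R1/siteA", "S1"): 0.2, ("R1/siteA", "S2"): 0.04}
service_rate = {"S1": 100.0, "S2": 100.0}
objective = average_delay(flows, network_delay, service_rate)
```

Because the server delay grows non-linearly as a server's load approaches its service rate, the objective is non-linear in the flows, which is why the claims refer to a non-linear optimization problem.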

8. A method as claimed in claim 1 or 7 wherein the step d) comprises the step of formulating and solving a minimum 
cost network flow problem. 
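
A minimum cost network flow computation of the kind recited in claim 8 can be sketched in Python with the successive shortest path method. This is a simplified illustration under stated assumptions: arc costs are linear (the claimed problem is non-linear), the solver is a generic textbook algorithm rather than the one used in the specification, and the node numbering, capacities, and costs are hypothetical.

```python
def min_cost_flow(n, edges, source, sink, demand):
    """Route `demand` units from source to sink at minimum total cost.

    edges: list of (u, v, capacity, cost) with integer node ids 0..n-1.
    Returns (total_cost, flows), where flows[i] is the flow on edges[i].
    """
    graph = [[] for _ in range(n)]      # node -> indices of outgoing arcs
    to, cap, cost = [], [], []
    for u, v, c, w in edges:            # forward arc at 2i, reverse arc at 2i+1
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    inf = float("inf")
    sent = total = 0
    while sent < demand:
        # Bellman-Ford shortest path in the residual graph.
        dist = [inf] * n
        prev = [-1] * n
        dist[source] = 0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == inf:
                    continue
                for e in graph[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[u] + cost[e]
                        prev[to[e]] = e
                        changed = True
            if not changed:
                break
        if dist[sink] == inf:
            raise ValueError("demand cannot be routed")

        # Find the bottleneck along the path, then push flow along it.
        push = demand - sent
        v = sink
        while v != source:
            push = min(push, cap[prev[v]])
            v = to[prev[v] ^ 1]
        v = sink
        while v != source:
            e = prev[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = to[e ^ 1]
        sent += push
        total += push * dist[sink]

    flows = [cap[2 * i + 1] for i in range(len(edges))]
    return total, flows

# Hypothetical network: 0 = redirector-item pair, 1 = CSA1, 2 = CSA2, 3 = Sink.
edges = [(0, 1, 60, 200), (0, 2, 100, 40), (1, 3, 100, 312), (2, 3, 50, 312)]
total_cost, flows = min_cost_flow(4, edges, source=0, sink=3, demand=80)
# Weights for the pair are the fractions of its demand sent to each server.
weights = {"CSA1": flows[0] / 80, "CSA2": flows[1] / 80}
```

In this sketch the cheaper route through CSA2 is filled up to its capacity before any flow is sent through CSA1, and the resulting flow fractions serve as the forwarding weights.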

9. A method as claimed in claim 8 further comprising the step of first determining an initial distribution that maps
logical items onto the servers.

10. A method as claimed in claim 8 wherein the logical items are a plurality of hot sites. 

11. A method as claimed in claim 10 wherein the servers are caching servers that each replicate at least one of the hot
sites. 

12. A method as claimed in claim 8 wherein the logical items are groups of clients. 

13. A method as claimed in claim 12 wherein the groups of clients are determined by their IP addresses. 
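
Claim 13 does not state how IP addresses determine the groups. One possible approach, shown here purely as an assumption, is to group clients by network prefix using Python's standard `ipaddress` module; the function name `client_group` and the default prefix length of 24 bits are hypothetical.

```python
import ipaddress

def client_group(ip, prefix_len=24):
    """Return the network prefix (as a string) that a client address falls in.

    Grouping by a fixed prefix length is an illustrative assumption.
    """
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network)

group = client_group("192.0.2.57")
```

Clients that share a prefix map to the same logical item, so a redirector would keep one set of weights per prefix rather than one per client.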

14. A method as claimed in claim 8 wherein the logical items are groups of documents. 

15. A method as claimed in claim 14 wherein the groups of documents are determined by their origin server IP 
addresses. 

16. A method as claimed in claim 14 wherein the at least one redirector is a content-smart switch and the groups of 
documents are determined by their URLs. 

17. A method as claimed in claim 13 or 15 wherein the servers are web servers. 

18. A method as claimed in claim 13 or 15 wherein the servers are caching servers. 

19. A method as claimed in claim 8 wherein the at least one redirector is a virtual server for a plurality of web servers. 

20. A method as claimed in claim 8 wherein steps a) through d) are periodically repeated to determine a new set of
weights associated with each redirector-logical item pair to thereafter be used by the at least one redirector.

21. A method as claimed in claim 8 wherein steps a) through d) are repeated when a change of an access rate of
requests at a redirector is detected, or when a server failure is detected, or when a change in network delay is
detected, or when a change in the delay at a server is detected.
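
The trigger conditions of claim 21 can be sketched as a comparison between two measurement snapshots. The claims do not say what counts as a detected change, so the relative threshold, the snapshot layout, and the function name `should_recompute` are all assumptions made for illustration.

```python
def should_recompute(previous, current, rel_threshold=0.2):
    """Return True if the weights should be recalculated.

    Each snapshot is a dict with keys "servers_up" (iterable of names) and
    "access_rate", "network_delay", "server_delay" (dicts of measurements).
    The 20% relative threshold is a hypothetical choice.
    """
    # A server present before but missing now is treated as a failure.
    if set(previous["servers_up"]) - set(current["servers_up"]):
        return True
    for kind in ("access_rate", "network_delay", "server_delay"):
        for key, old in previous[kind].items():
            new = current[kind].get(key, 0.0)
            if abs(new - old) > rel_threshold * max(abs(old), 1e-12):
                return True
    return False

# Illustrative snapshots only.
before = {
    "servers_up": ["S1", "S2"],
    "access_rate": {("R1", "siteA"): 80.0},
    "network_delay": {("R1", "S1"): 0.2, ("R1", "S2"): 0.04},
    "server_delay": {"S1": 0.014, "S2": 0.02},
}
after = dict(before, access_rate={("R1", "siteA"): 120.0})
recompute = should_recompute(before, after)
```

A threshold of this kind avoids recalculating the weights for every small fluctuation in the measurements.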

22. A method as claimed in claim 8 wherein a central management station (CMS) connected on the communications
network collects the determined access rates of requests in step a) and the network delay in step b) from the at
least one redirector, and the server delay in step c) from the plurality of servers, and then performs step d), the
method then further comprising the step of forwarding the determined weights to the at least one redirector.

23. A system for processing client requests to a plurality of servers connected on a communications network, at least
some of the client requests being capable of being satisfied by more than one of the servers, the system comprising:

at least one redirector; and

at least one processor performing the steps of: 

a) determining an access rate of requests associated with each of a plurality of redirector-logical item
pairs;

b) determining a network delay between each of a plurality of clients and the plurality of servers; 

c) determining a server delay incurred in processing a client request at each of the plurality of servers; and 

d) using the determined access rates of requests in step a), the network delays determined in step b) and
the server delays determined in step c) as inputs, solving a non-linear program optimization problem to
determine a set of weights associated with each of the plurality of redirector-logical item pairs so as to
minimize the average delay associated with the client requests;

the at least one redirector using the determined weights associated with each redirector-logical item pair to 
probabilistically forward each client request to a server that can satisfy that request. 

24. A system as claimed in claim 23 wherein the at least one processor performs step d) by formulating and solving a 
minimum cost network flow problem. 

25. A system as claimed in claim 24 wherein the at least one processor first performs a step of determining an initial
distribution that maps logical items onto the servers.

26. Apparatus as claimed in claim 3 or a system as claimed in claim 24 wherein the logical items are a plurality of hot 
sites. 

27. Apparatus or a system as claimed in claim 26 wherein the servers are caching servers that each replicate at least
one of the hot sites. 

28. Apparatus as claimed in claim 3 or a system as claimed in claim 24 wherein the logical items are groups of clients. 

29. Apparatus or a system as claimed in claim 28 wherein the groups of clients are determined by their IP addresses.

30. Apparatus as claimed in claim 3 or a system as claimed in claim 24 wherein the logical items are groups of
documents.

31. Apparatus or a system as claimed in claim 30 wherein the groups of documents are determined by their origin
server IP addresses. 

32. Apparatus or a system as claimed in claim 30 wherein the at least one redirector is a content-smart switch and the 
groups of documents are determined by their URLs. 

33. Apparatus or a system as claimed in claim 29 or 31 wherein the servers are web servers. 

34. Apparatus or a system as claimed in claim 29 or 31 wherein the servers are caching servers. 

35. Apparatus as claimed in claim 3 or a system as claimed in claim 24 wherein the at least one redirector is a virtual
server for a plurality of web servers. 

36. A system as claimed in claim 24 wherein the at least one processor performs steps a) through d) periodically to
determine a new set of weights associated with each redirector-logical item pair to be thereafter used by the at least
one redirector.

37. A system as claimed in claim 24 wherein the at least one processor repeats steps a) through d) when a change of
an access rate of requests at a redirector is detected, or when a server failure is detected, or when a change in
network delay is detected, or when a change in the delay at a server is detected.